ABSTRACT


It is proposed that a hybrid sensory feedback system, comprising a visual peripheral
component together with a haptic component corresponding to visual foveal information,
is equivalent to full visual sensory feedback. Such a system is constructed, and the
ability of subjects to perceive objects using it is investigated by observing and
classifying their search strategy. Although the provision of a peripheral component
provides advantages over a purely haptic system, it is concluded that subjects rely
heavily on the haptic data, and the resulting hybrid system is not equivalent to full
vision.
I. INTRODUCTION


A.  REMOTELY  OPERATED  VEHICLES  (ROV) 

1.  Background 

ROVs are utilized to conduct underwater tasks where it is necessary or preferable to
avoid human presence. Such tasks usually arise in hazardous or dangerous environments.
ROVs have found wide use in the offshore industry and, to a lesser extent, in the
military and scientific research communities. Applications of ROVs include inspection,
monitoring, survey, search, identification, and retrieval. Four classes of ROVs have
been identified: tethered vehicles, free swimming vehicles, bottom crawling vehicles,
and untethered vehicles. Figure 1 shows an example of a tethered ROV with a manipulator
arm. While ROVs provide a significant increase in capabilities over a diver in terms of
greater operating range, increased time on station, and human safety, the manipulator's
inability to provide detailed haptic, or touch, input creates difficulty for the
manipulator operator in performing dextrous tasks. Further, underwater tasks are
frequently performed in reduced visibility, thereby limiting object recognition
ability. This lack of detailed haptic input in ROV manipulators contrasts with a human
diver's highly developed sense of touch, which enables a diver to perform complicated
manipulative tasks in the absence of visual input. This situation creates the likelihood
that future generations of ROVs, which will be heavily reliant on visual feedback, may
not offer the most efficient sensory feedback capabilities for telemanipulator
operation.

2.  Planned  Developments 

Future ROV developments rely on the concept of telepresence for manipulator operation.
Sensory inputs allow the operator to "feel as if he were actually present at the remote
location." (Beierl, 1991, p. 4) A conceptual example of a future generation ROV
teleoperation system is shown in Figure 2. The system is comprised of a master control
station with a position-sensing, force-reflective controller for the remote station.
The remote station consists of a manipulator subsystem involving a head, torso, and two
arms. Hands are mounted on the arms, and consist of a wrist, thumb, and at least two
fingers.
Figure 1. Remotely Operated Vehicle (Beierl, 1991, p. 3)
B. TELEOPERATION

Teleoperation requires various interdependent components that provide self-locomotion,
communication capabilities, and the ability to interject human presence into the area
of interest. To study in detail the separate functions required to perform these tasks,
teleoperation can be broken down into functional categories. Among these are the
actuator, control, communication, structural, and sensor subsystems. It is the sensor
subsystem which provides the man-machine interface, permitting human intervention to be
projected into a remote workspace. While human sensory receptors include the five
traditional senses, as well as heat detection and balance, manipulator sensors are
primarily dedicated to visual, acoustic, and haptic sensing. Each of these has its own
unique capabilities and problem areas associated with underwater manipulator work.

1.  Visual  Sensing 

Most teleoperators allow for direct vision by the human operators. For optimum
interface, this requires sufficient lighting, a problem in most underwater work due to
the absence of a light source other than on the manipulator, and the presence of
particulate matter in the water that causes light waves to scatter. The construction of
a viewing system calls into question several factors concerning lighting and manipulator
placement.

Air Force studies have shown ... the distance from the manipulator operator's unaided
eyes to the work should not be greater than about 10 feet. As distance increases,
visual resolution and depth perception drop off and task performance time rises.
(Johnsen, 1971, p. 151)

2.  Acoustic  Sensing 

A sound sensory channel offers a supplemental source of information not always
available through a vision system. Sonar provides distance, speed, and directional
knowledge about an object in water conditions that would render a sight system unusable.
An imaging sonar, substituting ultrasonic sound for light, is analogous to television.
This type of system locates the object of interest by means of a sound transducer.
Reflected sound waves are captured by hydrophones and processed into electronic signals
capable of being turned into a visual image. The main limitations are the poor image
resolution due to the large wavelength of sound waves, and the short working distances
due to the rapid attenuation of sound waves in seawater. (Johnsen, 1971, pp. 158-159)

3.  Haptic  Sensing 

In spite of the presence of other sensory inputs, human divers are known to receive the
most information through their sense of touch. This is known as the haptic sense, which
consists of the diver's tactile sense - feedback generated by contact with an object -
and knowledge gained through the body's position and orientation, known as kinesthesis.
The ability of an ROV operator to duplicate this level of information gathering
sensitivity is dependent on both the type and composition of the manipulator. Terminus
type feedback is transmitted from the end effector and allows the operator to sense only
an object or constraint located at the end of the manipulator. More complex,
anthropomorphic man-machine interfaces allow transmission of the forces which result
from the orientation of the manipulator. Structural characteristics of the manipulator
such as rigidity, friction, inertia, and size also reduce the sensitivity of the
manipulator as compared to a human hand and arm.
II. THEORY


The human ability to recognize and identify an object is dependent upon an interwoven
network of information provided by the five senses. This information is collected
through external stimulus from the surrounding environment and combined with internal
body sensations such as balance, orientation, and equilibrium. The complexities involved
in understanding this highly individualistic process combine both the "science" of
physiology and the "art" of psychology. External sensory stimulus produces a mental
image which is compared with a known internal image from the human memory. Comparison of
the differences between the perceived and known images is the recognition process.

A. VISUAL RECOGNITION

Most research into human perception and recognition has focused on visual observation.
Sight begins when reflected light waves from a viewed object pass through the cornea,
the thin transparent tissue which acts as the eye's fixed outer lens. The cornea bends
the light waves, which then pass through the iris, the shutter-like device which
controls the amount of light that enters the pupil. The last stage of focusing is
accomplished by the bending of the light waves through a crystalline lens located behind
the iris. The light waves then fall on the retina, a thin sheet of neural tissue at the
back of the eye over which the image is displayed. Lying in the center of the macula
(the yellow spot of the retina) is the fovea, which contains a highly concentrated array
of photoreceptor cells. It is the foveal vision component which provides the narrow,
central field of focused vision. Detailed visual information is received only through
the narrow (1-2°) fovea; therefore, the eye must scan the object (unless it subtends
only a very small angle of the visual field) in order to provide information. These eye
movements are called saccades and occur very rapidly, accounting for only 10% of the
viewing time. "During normal viewing of stationary objects, the eye alternates between
fixations ... and rapid movements called saccades". (Noton and Stark, 1971, p. 34)

Since the fovea encompasses such a limited range, the majority of the visual field does
not provide a detailed description of an object. This larger portion is the peripheral
component and is used in establishing a sense of the relative spatial order of the
object. It is the combination of these two components that enables the reader both to
focus on the lines of text (foveal) and to shift immediately from the end of one line to
the beginning of the next line (peripheral). Experiments by Watanabe have shown that if
only a foveal component is permitted, the visual search becomes slower and more
sequential than when both components are present.

The ability to inspect fine detail without a sense of the larger total object
contradicts the Gestalt theory that objects are identified by their complete state
rather than by any analysis of their features. More recently, the Gestalt approach has
been theorized to hold only for simpler objects, and those that are well known to the
observer. Support for a more sequential search has been shown in experiments where the
complexity of the viewed object is varied. Subjects have been measured to require a
longer time to identify more complicated objects, which follows from the need to check
more individual components. It has also been shown that a subject takes longer to
recognize a previously specified object than to reject a non-prescribed object. In a
sequential search of a prescribed object, each component must be compared with the
corresponding part of the specified object, whereas the presence of only a few
non-matching features enables the observer to reject the object as being different. Both
of these results conflict with the Gestalt theory.

The supposition that visual perception and recognition are composed of fairly ordered
and identifiable fixed paths - called "scan paths" (Noton and Stark, 1971) - was
developed from experiments which in general show that observers do not follow a random
viewing path. Figure 3 shows the recorded fixations of an observer looking at the
drawing of a polygon and the sequence of the fixations in an eight second time frame.
The scan path is clearly discernible in fixations 4 through 11 and 11 through 18. While
scanpaths were not always observed, the tendency was for the observer to exhibit a
scanpath.

B.  HAPTIC  RECOGNITION 

The ability to detect one's surroundings through bodily contact is known as the haptic
system. Haptic, from the Greek "able to lay hold of," is defined as "the perceptual
system by which animals and men are literally in touch with the environment" (Gibson,
1966, p. 97). The haptic system encompasses the entire body - muscles, joints, skin -
and provides information on the interaction between a body and its environment. In
humans, the two primary parts of the haptic system are the tactile receptors and the
physical structure of the body. The haptic system, unlike other perceptual systems such
as the auditory or taste-smell, is both a passive and an active system. The passive mode
detects motion, contact, proximity, and in general the source of the stimulation. Active
perception, or exploratory search, detects tangible physical properties such as object
size, shape, surface texture, and hardness.

Figure 3. Foveal Positions and Order of Saccades (Noton and Stark, 1971, p. 36)

Proprioception, or kinesthesis, the awareness of sensation, has several forms. Muscular
proprioception is the body's ability to judge muscle tension and force. Articular
proprioception detects the body's position through joint angles. Vestibular
proprioception includes receptors in the inner ear which provide for balance and
equilibrium. Cutaneous proprioception is the "touch" or tactile sensation whereby
subcutaneous mechanoreceptors are stimulated by contact with or proximity to an object.

The haptic sensation cannot provide the detailed analysis of the foveal vision
component; however, its ability to provide information to the recognition process should
not be considered unimportant or subordinate to vision. Indeed, the haptic ability of
the sightless to provide comparable perception is an indicator of its power. "Haptics is
not so inferior to optics ... since the blind depend upon it for a whole realm of useful
perception." (Revesz, 1950)

C. COMPARISON OF SEARCH MODES

The localized information provided by haptic sensing can be considered analogous to the
narrow scope of the foveal field. Research with a force-reflecting telemanipulator,
which provides a kinesthetic sense, has shown the highly sequential search strategy
characteristic of foveal-only search. (Acosta, 1991) Exploring the analogy between
haptic and visual search, and the possibility of modeling full vision through a combined
haptic-vision system, is the aim of this research.

1.  Search  Descriptors 

Several different qualitative and quantitative measurements have been developed to
analyze a subject's search strategy once specific fixation points have been located. One
of these is the code circle, which characterizes the manner of the search. Figure 4
shows an example of a search path and code circle for the letter "A". The first drawing
shows the lower case letters which represent the individual features of the object. The
second drawing is the search path of the letter. The arrow pointing inwards at the lower
left corner of the "A" represents where the initial contact was made. The dashed line
indicates where the subject broke contact with the object after fixating on the right
side ("f") and then regained contact on the lower right horizontal leg ("h"). The
outward arrow denotes the last fixation prior to completion of the search. Features
which are searched sequentially are indicated by connecting lines on the outside of the
code circle. A search with scanpath tendencies is represented by lines across the
interior of the code circle. An interrupted search is shown by a connecting line on the
outside of the circle that does not connect two adjacent features. The third drawing
shows the code circle, with lower case letters located around the perimeter
corresponding to the individual object features. The smaller circle contains the
features for the internal triangular pocket. The progression of the arrows on the code
circle reflects the sequence of fixations. (Acosta, 1991, pp. 52-54)

Another method of examining search strategies is to assign a character string to the
sequence of fixations. In the previous example, the sequence of fixations is represented
by the string [aqknbdebfh]. By comparing this string to a previously defined one, the
similarity of the two sequences can be quantified by means of string editing, which
examines the "cost" of transforming the observed string into the predefined string.

Editing  a  string  has  three  basic  operations  -  substitution,  deletion,  and  insertion. 

A "cost" for each such operation must be defined. For example, substitutions are
assigned a cost of "2", deletions and additions a cost of "1". To then transform a
string observed as [A C A] into the previously defined string [C A D A C] requires
inserting a "C" at the beginning and at the end (cost "1" each), and substituting a "D"
for the middle "C" (cost "2").

By defining the value of the sum of operations as the "distance" between two figures, a
comparison of the distances obtained from various observations can establish the
"similarity of the sequence of visual fixations". (Hacisalihzade, Stark, Allen, 1990,
p. 7)
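
The string-editing cost described above can be sketched in a few lines of Python. This
is only an illustrative sketch, not the procedure used by the cited authors: the
function name edit_cost and its parameters are hypothetical, and the default costs (2
for a substitution, 1 for an insertion or deletion) are taken from the quoted example.
It uses the standard dynamic-programming formulation of a weighted edit distance.

```python
def edit_cost(observed, reference, sub_cost=2, indel_cost=1):
    """Minimum total cost of editing `observed` into `reference`.

    Hypothetical helper for illustration; default costs follow the
    quoted example (substitution 2, insertion or deletion 1).
    """
    rows, cols = len(observed) + 1, len(reference) + 1
    # cost[i][j] = cheapest way to turn observed[:i] into reference[:j]
    cost = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        cost[i][0] = i * indel_cost          # delete every observed symbol
    for j in range(1, cols):
        cost[0][j] = j * indel_cost          # insert every reference symbol
    for i in range(1, rows):
        for j in range(1, cols):
            same = observed[i - 1] == reference[j - 1]
            cost[i][j] = min(
                cost[i - 1][j] + indel_cost,                     # deletion
                cost[i][j - 1] + indel_cost,                     # insertion
                cost[i - 1][j - 1] + (0 if same else sub_cost),  # match or substitution
            )
    return cost[-1][-1]


print(edit_cost("ACA", "CADAC"))  # distance for the quoted example
```

For the quoted example the function returns 4, the sum of the two insertions and the
one substitution.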

A method of determining the progression of the search from one observed feature to the
next is the sequence ratio, Sr. This is defined as the number of sequential fixations
divided by one less than the total number of fixations. Since the sequence ratio lies
between zero and one, it may be expressed as a percentage.
For example, using the object in Figure 4, the sequence [defghjklkJmopopqrstu] has 18
sequential features, therefore a sequence ratio of 90%. This was used in comparing the
full vision search, which with its saccadic tendencies has a low Sr (≈10%), to a
haptic-only (or foveal-only) search, where the search is highly ordered and sequential,
and has a high Sr (≈95%).
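
A minimal Python sketch of the sequence ratio follows. The text does not spell out how
a fixation is judged "sequential", so the sketch assumes that a move counts as
sequential when it goes between features that are neighbors in a given feature order
(treated as cyclic, as on the code circle). The function name, the feature_order
argument, and the sample strings are hypothetical.

```python
def sequence_ratio(fixations, feature_order):
    """Fraction of moves between consecutive fixations that are sequential.

    Assumption (not stated in the text): a move is sequential when the two
    features are neighbors in `feature_order`, which is treated as cyclic.
    """
    if len(fixations) < 2:
        raise ValueError("at least two fixations are needed")
    position = {feature: i for i, feature in enumerate(feature_order)}
    n = len(feature_order)
    sequential = 0
    for prev, curr in zip(fixations, fixations[1:]):
        gap = abs(position[prev] - position[curr])
        if min(gap, n - gap) == 1:   # adjacent, allowing wrap-around
            sequential += 1
    return sequential / (len(fixations) - 1)


# Hypothetical object with six features labeled a-f around its perimeter.
ordered = sequence_ratio("abcdef", "abcdef")   # every move is to a neighbor
jumping = sequence_ratio("adbecf", "abcdef")   # every move skips features
print(f"{ordered:.0%} {jumping:.0%}")
```

Under this assumption an ordered, contact-following search scores near one, while a
search that jumps between distant features scores near zero.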

2.  Foveal  Visual  Search 

Work performed by Watanabe examined the observations of subjects when their vision had
been modified. Using equipment that showed where a subject was looking, and then masking
either the foveal or peripheral component, Watanabe was able to determine changes in the
search strategy. When full vision was allowed, the visual fixations closely resembled
the saccadic tendencies of the scanpath theory of Noton and Stark. In the experiments
where the foveal component was masked, the fixations oscillated to the left and right of
the target in order to try to obtain the detailed information which is absent when
looking directly at the object. Video recordings of Watanabe's work show that when the
peripheral component was masked, the subject's fixations slowed and became more
sequential in nature. A subject seen reading from a book was unable to proceed directly
from the end of one line of text to the beginning of the next line. The recorded visual
search slowly looked for a continuous path in the direction of the left-hand side of the
page, and then vertically towards the proper line; there was no evidence of any scanpath
characteristics. This foveal-only search exhibited patterns similar to the haptic-only
explorations studied by Acosta, thus supporting the hypothesis that a haptic input could
be an adequate substitute for the foveal vision component.

Figure 4. Search Path and Code Circle (Acosta, 1991, p. 53)

3.  Haptic  Search 

Similar to the concept of two supporting subsystems for visual search, the haptic, or
touch, system can also be thought of as having two separate channels: the tactile
system, and the kinesthetic system, which obtains information through the spatial
orientation of body parts. In order to study the effect of each haptic sensory system,
Driels and Spain developed experimental work that decoupled the tactile mode from the
kinesthetic mode by having subjects use a telemanipulator (conceptualized in Figure 5)
to identify remote objects. The telemanipulator provided force feedback through a system
of antagonistic cables and pulleys which reproduced the operator's movements.
Qualitative observations of the tests conducted led to the supposition that object
identification is initially based upon an accumulation of knowledge about individual
features. This non-Gestalt approach was further developed in subsequent work by Acosta,
who showed that such a decoupled haptic system caused the subject to search in much the
same highly sequential fashion as the foveal-only vision search. The lack of a "global"
recognition capability to see the object in its entirety thus becomes comparable to the
lack of a peripheral vision component in visual object search.

4.  Hybrid  Sensory  System 

The similarity between the foveal visual search done by Watanabe and the haptic search
work done by Acosta indicates that substitution of a haptic sensory input for the foveal
vision component can be used in a model for a full vision system. A degraded visual cue,
representing the peripheral component, will provide the gross spatial information; a
haptic input will provide the detailed narrow information required for individual
feature recognition. A method of combining these two modes into a hybrid full vision
system can be tested and, if successful, provide a means for improving the sensory
acquisition of an ROV at a reduced cost. An equivalent full vision system would not
require the same quality visual sensors necessary for foveal vision in a direct viewing
system. Lesser-grade optical sensors combined with a haptic sensory input would result
in lower cost while providing the same sensory capabilities as a full vision system.
Indeed, in many environmental conditions, even the most well-designed optics may not
yield sufficient resolution to allow remote foveal recognition. A remote haptic channel
is not subject to the same visual limitations as the normal foveal component; hence it
provides a more efficient, possibly less expensive, means to accomplish detailed object
recognition.

Figure 5. Conceptualization of Telemanipulator

D.  OBJECTIVES  OF  THESIS 

The objective of this research is to examine the search strategy of a hybrid
haptic-visual system to determine future sensor requirements for the next generation of
remotely operated vehicles. The substitution of haptic feedback for the foveal vision
component is analyzed to determine whether such a system is an adequate model for full
visual search. This system would provide a much less costly alternative to a higher
grade optical sensor system and would prove more useful in environments where vision is
restricted by water conditions.
III. EXPERIMENTAL DEVELOPMENT


A.  OVERVIEW 

Based upon the previous data of haptic search approximating the foveal vision component,
a model for full vision was developed using a computer based vision system to simulate
the peripheral vision component. Test subjects were shown a digitized video image of the
object to be identified. In order to simulate the unfocused nature of the peripheral
component, a computer program was employed to digitize the object into a "mosaic" or
tile pattern. This allowed the subject to sense the general size and shape of the
object, but did not give sufficient detail to allow for recognition. A force-reflecting
telemanipulator utilizing haptic recognition to provide detailed feature information was
considered as an alternative for the foveal vision component. The combined nature of
these two sensory inputs as an acceptable model for full vision search was analyzed by
means of quantitative measures of recognition.

B. SYSTEM COMPONENTS

1. Telemanipulator

A seven degree-of-freedom (DOF), CRL force-reflecting telemanipulator of the terminus
type was used in this research. A cable-and-pulley system allowed the operator to sense
the forces experienced at the end effector through a pistol grip handle. A parallel
gripper locked a plastic brace-mounted steel probe in place. A high current LED was
mounted in the end of the one-quarter inch diameter probe. The LED provided visual
feedback to the observer via the projection camera and monitor during the combined
search mode. Figure 6 shows a schematic of the equipment set-up. The LED also served to
reduce the friction between the probe and the taskboard. Figure 7 shows the
telemanipulator and taskboard arrangement.

Haptic probing is accomplished by decoupling the tactile sensory system from the
proprioceptive system by placing the telemanipulator between the operator and the
taskboard. This serves as a sensory filter, and experiments by Driels and Spain have
shown that, when compared to direct manipulation of an object, the effect of haptic-only
probing is a degradation of the subject's proficiency in identifying an object.

Several mechanical effects of the telemanipulator contribute to the subject's reduced
ability to recognize objects. Friction between the probe and the taskboard causes
"mechanical noise" which makes object recognition more difficult. If the subject pushes
the probe tip with too great a force in the direction normal to the plane of the
taskboard, the ability to distinguish between the contact force with the edge of an
object and the normal friction force is lost. Familiarity with the force required to
just maintain contact on the surface of the taskboard is an essential element of subject
training. The individual features of the object must be large enough that the end
effector can examine them in detail. The effect which is most dependent upon operator
skill with the manipulator is inertia. The mass of the manipulator arm makes it very
slow to respond; until a proper skill level is developed, the ability to detect abrupt
changes in feature orientation, such as in exterior-angled corners, is limited. The
tendency to "overshoot" an exterior corner of an object is quite common during the
training phase, and even well trained operators suffer from an occasional loss of proper
probing speed control and consequently miss the corner of an object.

Figure 7. Telemanipulator and Task Board Arrangement

2.  Vision  System 

To simulate the peripheral vision component, a method to degrade the image quality had
to be developed. A computer program using the intrinsic commands of the Intelledex
Intellevue 203 Vision System was used. This system utilized a variation of the Microsoft
BASIC language called Vision BASIC. The specific commands issued included:

- VSNAP: An image acquisition command that writes the real-time digitized image
currently seen through the camera to the display RAM.

- VDIG: An image processing command issued after VSNAP that displays the contents of
the display RAM. Once this command is issued, no changes in the camera's viewing field
affect the displayed image.

- VPPEEK: An image processing and display command that samples the gray-scale value of
the specified pixel.

- VTPOKE: An image processing and display command that returns a specified gray-scale
value to the indicated pixel in the display RAM.

The complete program, listed in Appendix A, sampled each of the pixels in a
predetermined block size (31x30 in this research), averaged the gray-scale values of all
930 pixels in the block, then returned that average value to each pixel in the block.
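
The block-averaging step can be illustrated with a short Python sketch. It is not the
Vision BASIC program of Appendix A: the image is modeled as a plain list of rows of
gray-scale integers rather than the display RAM, the function name mosaic is
hypothetical, and integer averaging and the handling of partial blocks at the image
edges are assumptions.

```python
def mosaic(image, block_w=31, block_h=30):
    """Return a copy of `image` with each block replaced by its mean gray value.

    `image` is a list of equal-length rows of integer gray-scale values.
    The default block size follows the 31x30 blocks mentioned in the text;
    blocks cut off by the image edge are averaged over the pixels present
    (an assumption).
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = [row[:] for row in image]
    for top in range(0, height, block_h):
        for left in range(0, width, block_w):
            rows = range(top, min(top + block_h, height))
            cols = range(left, min(left + block_w, width))
            # Sample every pixel in the block (the role VPPEEK plays in the text).
            total = sum(image[y][x] for y in rows for x in cols)
            average = total // (len(rows) * len(cols))
            # Write the average back to every pixel (the role of VTPOKE).
            for y in rows:
                for x in cols:
                    result[y][x] = average
    return result


# Tiny example with 2x2 blocks so the effect is easy to see.
tiles = mosaic([[0, 10, 100, 100],
                [10, 20, 100, 100]], block_w=2, block_h=2)
print(tiles)
```

Each block collapses to a single gray level, which preserves the overall size and shape
of the object while removing the fine detail.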

Once the program had completed running, a general purpose system function command,
VBOTH, was issued. This command displayed the digitized image currently in the display
RAM superimposed upon the actual image. This allowed the movement of the probe to be
seen on the monitor at the same time as the digitized image. So as not to have the image
of the actual letter appear on the monitor, the f-stop on the projection camera was set
to the smallest opening. The brightness of the LED in the probe tip was then the only
"live" image which appeared on the monitor. To further ensure that no other part of the
"live" image came through, a large black cloth screen with a dull surface was mounted on
the wall behind the manipulator, reducing the amount of light reflected off the white
wall.

3. Video Monitoring

In order to analyze a subject's search strategy, a video recording of each run
was made. A video recorder was mounted behind the task board and recorded the
movement of the probe tip as each subject tried to identify the specified object. Because
the probe tip was not visible from in front of the task board, the camera had to be
positioned behind the task board. To have the recording appear in the correct orientation,
and not in the reverse image, an 18 inch square mirror was mounted to the wooden
frame of the task board. Figure 8 shows the position of the camera shooting an image
of the letter "B" from the reflection in the mirror. This arrangement allowed both the
video recorder and the projection camera to be out of the way of the telemanipulator
and permitted the operation of the video recorder away from the field of view of the
subject in the combined search mode. An external microphone was used to capture the
verbal comments of the subjects to provide additional clarification of the search strategy.

A significant amount of time was required to initially develop the arrangement
of all equipment in order to achieve proper lighting for both the projection camera and
the video recorder. The background light available during daytime provided a much
different source than the overhead fluorescent lighting used at night. Adjustments to the
task board, mirror, and camera positions were required prior to each session to obtain
the highest resolution picture.

C.  SUBJECT  TRAINING 

A dedicated training phase was required of all subjects so that they could become
familiar with the operation of the telemanipulator, whose large mass and length gave it
considerably more inertia to overcome than a human arm. The ability to make slight
adjustments and to anticipate the time lag in the manipulator's response required
considerable practice.

Initial training was conducted by having the subject look at the end effector while
probing an object. After that initial exposure to the force feedback generated by contact
with the object edge and the task board, the subject was shielded from the task board
by a large curtain. Further practice helped the subject develop confidence in identifying
the letters, which were placed in the normal, upright position. After the subject had
exhibited a level of skill sufficient to identify the object with a 90% confidence level,
the letter was rotated to randomly selected orientations. Once a similar level of
proficiency was shown, actual data collection commenced.

The integration of the visual search component was accomplished by first showing
the subject what a processed image looked like, and then conducting several practice
runs. To reduce any possible biases, the subjects were never told that any of their
efforts were not being videotaped. Standardization between all runs was emphasized
to ensure no subconscious changes in the subject's search habits.

D.  EXPERIMENT  PROCEDURES 

Three operators were trained as subjects for this research. Each subject was given
a similar series of objects to identify. The objects chosen were nine inch long block
capital letters that were randomly oriented on the task board so that the subject could
not "guess" the object from initial probing, as might have happened had the letter been
in a normal upright position. Letters of the alphabet were chosen so as to provide a
common object set for all subjects. Utilizing a set of objects that do not have the same
recognizability to all observers would add a possible bias to the subjects' search
patterns, one which would be undetectable in any of the measures of recognition. To
ensure standardization and to avoid any pre-planned search strategies, no specifics as to
the manner of identification or to any time constraints were given. The only instructions
provided to the subjects were to have a high degree of confidence in stating what they
thought the letter was. Verbal comments were encouraged throughout the search process
to assist in analyzing the data.

An object set of eleven different letters was used in the data collection. The letters
are listed in Appendix B. The lower case letters around the perimeter of each letter
correspond to particular features, such as a long straight edge, short straight edge, long
curved surface, short curved surface, acute angle, obtuse angle, corner, etc. These are
combined in a string to qualify the search strategy and quantify several measures of
recognition (e.g., sequence ratio, reversal rate, recognition rate) that are discussed in
Section IV.C.1. The object set was the same as that in the research done by Acosta in
order to corroborate the results of the haptic-only search. The same set was also used
for the combined search mode to more closely examine the differences between the two
modes. To prevent the subjects from recognizing that the object set was only a subset
of the entire range of possible objects, test runs were also conducted using other letters.
The subjects were not told of this, and all other facets of the data collection process were
identical, so that the subject was unaware that he was being given a "placebo" which
would not be used in the data analysis. This also served the purpose of exposing the
subject to more individual features, thereby expanding the number of possible choices
each would have to make in confirming the object.

In the haptic search mode, operators were visually and aurally masked to preclude
receiving any cues from either watching their hand movements or listening to noise
coming from contact between the probe and the object. In the combined haptic/visual
search, the computer vision system was used to acquire and process the object into a
digitized "mosaic" image. By varying the pixel block size of the program listed in
Appendix A, differing levels of image grain were obtained. Figure 9 shows the letter "M"
prior to being digitized. Figures 10 through 14 show the letter "M" processed into
various pixel block sizes ranging from 16x17 (Figure 10) to 35x30 (Figure 14). The
images produced by the finer block sizes (16x17, 21x20, 25x20) did not provide
sufficient image degradation, while the images processed into the 31x30 and 35x30 pixel
block sizes were both adequately degraded. The nearly square 31x30 block pattern was
considered more desirable than the rectangular 35x30 block pattern and was selected
for use in this experiment. This image provided the subject with a general spatial sense
of the object, but not enough fine detail to allow for immediate recognition. An LED was
fitted into the end of the probe; its location on the monitor gave the subject a reference
point on the digitized image.
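
The trade-off between the candidate block sizes can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes each size is quoted as width x height, and the function name and the use of a long-side to short-side aspect ratio as a measure of "squareness" are assumptions, not part of the original program.

```python
# Candidate pixel block sizes named in the text, assumed to be width x height.
BLOCK_SIZES = [(16, 17), (21, 20), (25, 20), (31, 30), (35, 30)]


def block_stats(width, height):
    """Return (pixels per block, long-side / short-side aspect ratio)."""
    pixels = width * height
    aspect = max(width, height) / min(width, height)
    return pixels, aspect


for w, h in BLOCK_SIZES:
    pixels, aspect = block_stats(w, h)
    print(f"{w}x{h}: {pixels} pixels per block, aspect ratio {aspect:.2f}")
```

On this reading the 31x30 block averages 930 pixels and is within about 3% of square, whereas the 35x30 block is roughly 17% longer on one side.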

The  actual  digitization  program  took  approximately  six  minutes  to  complete  its 
progression  across  the  monitor  screen.  During  that  time,  one  or  two  haptic-only  runs 
were  conducted.  This  provided  a  more  balanced  sequence  of  tests  and  also  kept  the 
subjects  more  engaged  in  the  experiment  by  not  having  so  much  "dead  time"  (relative  to 
the  actual  time  spent  on  each  individual  run). 

After establishing proficiency using the telemanipulator, actual data collection was
done with each subject in one to one and one-half hour time blocks over a period of
several weeks. Both physical and mental capacity to perform dextrous tasks rapidly
deteriorates after longer sessions. By collecting data for a shorter period of time on an
every-other-day basis, subjects maintained both their manipulator skill and interest in the
experiment, and the time off between sessions prevented the subjects from "memorizing"
feature sequences of particular letters.

Figure 12. Letter "M" Digitized into 25x20 Pixel Block Size

Figure 14. Letter "M" Digitized into 35x30 Pixel Block Size

E. DATA COLLECTION AND PROCESSING

Detailed analysis of the search strategy during each run required the ability to
identify every individual feature probed. To accomplish this, a video recording of each
subject's test series was made, along with a recording of their verbal comments. After
all subjects had completed their runs, the video tape was viewed in order to identify the
search mode. Figure 15 is an example; the string listed corresponds to the search path
of the letter "F" by Subject #3 for both the haptic-only and combined haptic/visual
search. Similar strings for all runs of each subject were made. The results are compiled
in Appendix C, which lists the raw data for each subject broken down by letter and by
quantitative measure of recognition.

Extreme care had to be taken in analyzing the videotapes. The frequent rapid "back
and forth" motion of the probe made it difficult to determine whether or not a particular
feature had been identified. Also, the operator's ability to maintain contact with the
edge of the letter as the probing progressed along an external corner was not always
observed on the tape in real-time speed. Either a slow motion or frame-by-frame replay
was necessary to determine whether the subject had in fact known he had identified a
corner. Verbal comments were useful in clarifying this, but more often than not, a
sequential frame review of the data was required.

HAPTIC

[bcdcbcdfedeghijkjijkikjtjkimljkim

nopqpqrsrqrqpqpolktmnopqpojkjihg]


COMBINED

[bcbabcdedefhihghihghgfhgehijkjijkjijkjihijkimnk]


Figure 15. Search Strings for Letter "F"

IV. RESULTS AND DISCUSSION

A. MEASURES OF RECOGNITION

Six specific measures of recognition were used for comparison between the search
modes. These are the sequence ratio, the total number of features probed, time to
recognition, number of reversals, recognition rate, and reversal rate. Two of these, the
sequence ratio and the total number of features probed, were also used to compare the
results obtained in the haptic-only search with those obtained by Acosta.

The sequence ratio, Sr, defined as the total number of sequential probings divided
by the total number of features probed minus one, was used by Acosta to compare the
highly ordered haptic search to the random full visual search. The total number of
features probed, the time to recognition, and the number of reversals (a reversal being a
change in direction during a run) were considered "raw" data; these three quantities are
very dependent upon the object to be identified and the operator's technique. To get a
better comparison of an individual subject's overall search strategy, two "normalized"
rate-based results were calculated. The recognition rate is defined as the total number of
features probed divided by the time to recognition. This provides a quantitative measure
of the "quickness" of a subject's probing. The reversal rate, defined as the number of
reversals divided by the total number of features probed, is a numerical expression for
the "back and forth" probing technique frequently exhibited by all subjects. These two
rates are interdependent and must be considered together. A high recognition rate does
not necessarily mean that a subject is rapidly exploring the entire object; if the reversal
rate is also high for that same letter, it is an indication of a fixation over a particular
group of features of that letter (such as the "v-shape" at the top of the letter "M").
Multiple explorations over a restricted area indicate that the subject does not have a
good feel for the "big picture" and is using an established reference point to build up
confidence prior to continuing his exploration. On the other hand, a low recognition
rate does not mean that the subject is having difficulty identifying the object. He may
have an estimate of the next feature to be searched and is taking his time to confirm
that supposition. However, if the reversal rate is high, the deliberate, thoughtful search
pattern may mean that the subject does not have a good idea of the letter, and is
repeating his search over a small area in hopes of finding a unique feature (such as the
cusp of the "B", or the tail of the "Q") which may lead to immediate recognition.
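
The definitions above can be turned into a short calculation on a search string of the kind shown in Figure 15. The following Python sketch is hypothetical and rests on two labeled assumptions: feature labels run consecutively around the perimeter, so a probing is counted as "sequential" when it moves to a neighboring label, and a reversal is counted whenever the direction of travel along the labels changes. The function and key names are illustrative; the thesis extracted these counts by reviewing the video tapes.

```python
def search_measures(path, seconds):
    """Compute the measures of recognition for one run.

    path    -- string of feature labels in the order probed, e.g. "abcbcd"
    seconds -- time to recognition for the run
    """
    if len(path) < 2 or seconds <= 0:
        raise ValueError("need at least two probed features and a positive time")
    # Signed step between consecutive labels ('a' -> 'b' is +1, 'd' -> 'c' is -1).
    steps = [ord(b) - ord(a) for a, b in zip(path, path[1:])]
    # Assumption: a sequential probing moves to an adjacent feature label.
    sequential = sum(1 for s in steps if abs(s) == 1)
    # A reversal is a change in direction of travel along the labels.
    directions = [1 if s > 0 else -1 for s in steps if s != 0]
    reversals = sum(1 for a, b in zip(directions, directions[1:]) if a != b)
    total = len(path)
    return {
        "total_features": total,
        "sequence_ratio": sequential / (total - 1),
        "reversals": reversals,
        "recognition_rate": total / seconds,   # features per second
        "reversal_rate": reversals / total,    # reversals per feature
    }


# Usage with a made-up seven-feature run lasting 10 seconds.
print(search_measures("abcdcde", 10.0))
```

The same definition of Sr is consistent with the entries of Appendix C: for example, 103 sequential probings out of 110 features explored gives 103/109, or 94.50%.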

B. HAPTIC COMPARISON


Table 1 summarizes the data obtained from the haptic-only search. The average
sequence ratio of all three subjects, 93.3%, is comparable to the value of 95% obtained by
Acosta. The average of all three subjects' total number of features probed is 80.9, also
comparable to the Acosta result of 87.0 features. These similar results confirm that a
haptic-only search is an adequate substitute for the foveal vision component in a hybrid
sensory system.
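
As a quick check of the comparison above, the average number of features probed can be recomputed from the per-subject values of Table 1 and set against the Acosta figure quoted in the text. This is an illustrative sketch only; the variable names are hypothetical.

```python
from statistics import mean

# Table 1, average number of features probed, Subjects #1 to #3.
haptic_features = [122.0, 52.17, 68.43]
# Result attributed to Acosta in the text.
acosta_features = 87.0

avg = mean(haptic_features)
relative_gap = (acosta_features - avg) / acosta_features

print(f"mean features probed: {avg:.1f}")
print(f"shortfall relative to Acosta: {relative_gap:.1%}")
```

The mean comes to about 80.9 features, roughly 7% below the Acosta value.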


Table 1. RESULTS FOR HAPTIC-ONLY SEARCH

                                          SUBJECT #1   SUBJECT #2   SUBJECT #3
Sequence Ratio (%)                        96.39 and 89.82 [third value missing in source]
Ave # of Features                         122.0        52.17        68.43
Ave Recognition Time (seconds)            121.6        86.00        68.57
Ave Recognition Rate (features/second)    1.000        0.609        1.056
Ave # of Reversals                        22.00        9.50         22.57
Ave Reversal Rate (reversals/feature)     0.175        0.193        0.353


C.  COMBINED  VISUAL/HAPTIC  SEARCH 
1.  Data  Analysis 

Table 2 lists the results of the combined search for each of the subjects. These
numbers are averaged from the compilation of data contained in Appendix C, which lists
the results for each subject's search of each object broken down into all six calculated
categories.

Table 2. RESULTS FOR COMBINED HAPTIC/VISUAL SEARCH

                                          SUBJECT #1   SUBJECT #2   SUBJECT #3
Sequence Ratio (%)                        93.77        98.20        92.24
Ave # of Features                         87.09        42.86        41.40
Ave Recognition Time (seconds)            122.5        88.86        49.70
Ave Recognition Rate (features/second)    0.762        0.507        0.828
Ave # of Reversals                        14.[?]0      5.29         15.30
Ave Reversal Rate (reversals/feature)     0.164        [illegible]  0.347

a. Total Number of Features Probed

The average number of features probed per object was noticeably greater in
all three subjects for the haptic search than for the combined search. Relative to the
haptic-only values, the combined-search counts were lower by 18% to 40%, averaging
almost 29%. The average in the haptic-only search was within 10% of Acosta's results.
The range of the number of features probed in each subject's search varied widely for
different letters, reflecting the innate differences between operator skills and probing
techniques. Further, while every effort was made to ensure standardization of test
procedures, the changing nature of physiological and psychological influences from day
to day (i.e., from session to session) may have affected the subject's performance.
Therefore, no direct correlation of the raw number results of this category is considered
appropriate. The standardized, rate-based recognition rate is a more accurate measure
of the true nature of each subject's efforts.

b.  Time  to  Recognition 

The time to recognition is considered the most subjective measure of a
subject's search strategy. The nature of the peripheral component as modeled by the
computer digitization process makes it possible that the altered image could be more
"clear" to one observer than another. This characteristic may manifest itself in a
subject's search technique. If one subject has a better intuitive feel for what the digitized
image is than does another subject, his search should be more concise and he is likely
to mentally process a greater amount of information in the same time frame. This trait
could also be reflected in a relatively higher recognition rate.

The time to recognition for the first and second subjects was almost the
same (within about 3%) in both search modes, whereas the third subject's average
recognition time was over 27% faster in the combined search mode, where the visual
peripheral component was available. Since Subject #3 also had a 9% higher recognition
rate than Subject #1, and a 63% higher recognition rate than did Subject #2, it can be
inferred that Subject #3 had a better ability to distinguish the digitized image prior to
commencing haptic search.
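
The percentages quoted above follow directly from the averages in Tables 1 and 2. The sketch below recomputes them; it is illustrative only, and the helper name and dictionary layout are hypothetical.

```python
def percent_change(old, new):
    """Signed change from `old` to `new`, as a percentage of `old`."""
    return (new - old) / old * 100.0


# Average recognition times in seconds: (haptic-only, combined), Tables 1 and 2.
recognition_time = {1: (121.6, 122.5), 2: (86.00, 88.86), 3: (68.57, 49.70)}
# Average combined-search recognition rates in features per second, Table 2.
combined_rate = {1: 0.762, 2: 0.507, 3: 0.828}

for subject, (haptic, combined) in recognition_time.items():
    change = percent_change(haptic, combined)
    print(f"Subject #{subject}: recognition time changed by {change:+.1f}%")

for other in (1, 2):
    gain = percent_change(combined_rate[other], combined_rate[3])
    print(f"Subject #3 rate relative to Subject #{other}: {gain:+.1f}%")
```

A negative time change means the combined search was faster; only Subject #3 shows a large change.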

c. Recognition Rate

In all three subjects, the recognition rate was considerably higher, by 20% to
31%, for the haptic-only search than for the combined search. This indicates that the
addition of the peripheral vision component does not increase the subject's ability to
receive and process information at faster speeds. Providing another sensory input
actually slowed down the data gathering process. The phenomenon of "sensory
overload", where so much information is presented to the viewer that he is unable to
make use of any of it, is a potential problem. This behavior was sometimes noticeable
during the combined search, when the subject would stop probing and concentrate his
attention on the monitor. This increased the time spent in the search and consequently
yielded a lower recognition rate than if the subject had maintained continual movement
of the telemanipulator. Although the hybrid vision system did not cause complete
sensory overload, a real-world operator of such a system would have other sensory
inputs and distractions. The chance of saturating the operator's information-processing
ability might therefore exist and should be considered in the design of the equipment.

d.  Number  of  Reversals 

Each subject had roughly 50% to 80% more reversals in the haptic-only search
than in the combined search. Much as the time to recognition could not be examined
without reference to the recognition rate, the number of reversals, as a "raw" figure,
is more useful when considered in the context of reversal rate.

e.  Reversal  Rate 

The reversal rates seen in the haptic mode were from 7% to 57% higher
than in the combined search. The large range of values is attributed to the differences
in operator proficiency and, perhaps more importantly, to the level of confidence each
subject required before stating what the object was thought to be. Though a wide range
did exist in the size of the rate increase, the fact that all three subjects had higher rates
in the haptic-only mode quantifies the "back and forth" probing, a technique seen with
much greater frequency when subjects had no visual cue. This suggests a desire to
reconfirm individual features more frequently. Certain features, particularly obtuse
exterior angles such as the sides of the "X", were prone to many repetitive probings,
owing in large part (based on verbal comments) to the need to check that there indeed
was an angle, and not just a long straight edge. While the distinction between a straight
edge and a near 180° angle would not have shown up in the digitization process, the
ability of the subjects to view the LED would have provided a discernible clue.

f. Sequence Ratio

In each of the subjects' probings, the sequence ratio for haptic-only
searching was just a fraction below that of their results for combined search. Overall,
the average sequence ratio for haptic-only probing, 93.3%, is comparable to that for the
hybrid search, 94.7%. Both of these results are quite similar to the 95% result obtained
by Acosta. The fact that the values are almost identical in both modes was quite
surprising and indicates an overwhelming reliance of the subject on haptically acquired
information. This contrasts with the premise of this work, that the sequence ratio would
be closer to the 10% value found in previous work for full vision search than to the
almost 95% value in the haptic-only mode. The discovery that the sequence ratio for the
combined system does not tend at all in the direction of full visual search is concluded
to be the most convincing evidence that the proposed model does not provide the
equivalent of a full vision system.

2.  Search  Strategies 

While an attempt was made to quantify all search techniques, several qualitative
observations were noteworthy. The subject's initial exposure to the combined system
was much slower than succeeding runs using both inputs. The need to conduct thorough
training on the telemanipulator prior to commencing data collection is an unavoidable
reinforcement of the haptic reliance. When subjects were first exposed to the hybrid
search, there seemed to be a tendency on the operator's part "not to believe their eyes".
The subjects were observed using a strategy similar to the one they had used in the
haptic-only mode, but they appeared to be concentrating so intently on the visual cue
that they repeated their probings more often than they would after multiple sessions
with the combined sensory input.

In an attempt to verify if manipulator practice had prejudiced the subjects
towards over-reliance on the haptic input, an observer who had no familiarity with any
portion of the experiment was chosen as a "control" check. Using the same equipment
set-up and providing the same guidelines as in the rest of the data collection, this subject
was given no practice on the manipulator. By allowing this "control" to immediately use
the hybrid system, it was hoped to show whether any undue influence would be
attributed to either manipulator training or performing the haptic search first. However,
no such conclusion could be drawn, as the control subject exhibited a highly sequential
search strategy even in the presence of the visual cue. While this sample size is
statistically insignificant, the fact that the initial efforts of this "unbiased" search were so
similar to the actual test subjects is further evidence of the inherent reliance on the
haptically acquired data.

D.  COMPARISONS  BETWEEN  SEARCH  MODES 

The most obvious comparison between the haptic and combined search is the almost
identical sequential nature in both modes. Based upon previous work, where the
sequence ratio for haptic-only search was 95% and for full visual search near 10%, it was
believed that the results for a hybrid system would lie somewhere in between. The fact
that almost no distinction between the haptic search and combined search results can
be made is not totally without precedent, however, as it has been noted that human
divers rely extensively on their sense of touch to accomplish their tasks. Since much of
their work is done in the absence of visual contact, the need for well-developed dextrous
skill is essential.

V. CONCLUSIONS AND RECOMMENDATIONS


A.  CONCLUSIONS 

-The  addition  of  a  peripheral  vision  component  (as  modeled  by  the  computer  vision) 
to  a  haptic  input  simulating  the  foveal  vision  is  not  equivalent  to  full  visual  search. 

-In  spite  of  the  presence  of  a  visual  sensory  input,  subject  reliance  on  tactile  response 
is  the  predominant  means  of  deriving  localized  feature  information. 

B.  RECOMMENDATIONS 

-Reliance on haptically acquired data is consistent with human divers' experience.

-The haptic sensory channel provides the most reliable source of remote object
identification, and future research into the man-machine interface of ROVs should focus
on improving haptic search capabilities.

-The substitution of a sonar-type sensor for the peripheral vision component should
be tested to develop a better model for a hybrid full vision system.

-Future thesis work involving the control of telemanipulators and sensory input should
involve live testing with teleoperation devices coordinated through NOSC San Diego.

APPENDIX A. COMPUTER DIGITIZATION PROGRAM

APPENDIX B. OBJECT IDENTIFICATION FEATURES

APPENDIX C. EXPERIMENTAL RESULTS


MEASURE OF RECOGNITION              SUBJECT #1         SUBJECT #2   SUBJECT #3
Letter: C                           Haptic    Comb     Comb         Haptic    Comb
Sequential Features                 119       103      39           26        15
Total Features Explored             125       110      41           29        16
Sequence Ratio (%)                  95.97     94.50    97.50        92.86     100.0
Time to Recognition (sec)           137       118      85           33        24
Recognition Rate (features/sec)     0.912     0.932    0.482        0.879     0.667
Number of Reversals                 25        21       6            13        [illegible]
Reversal Rate (reversals/feature)   [values missing in source]

[Fragment of a further table; row and column labels lost in the source]

89.33   100.0

105   41

0.282   0.724   0.561

0.034   0.395   0.348


MEASURE OF RECOGNITION              SUBJECT #1
Letter: M                           Haptic    Comb
Sequential Features                 33        83
Total Features Explored             41        90
Sequence Ratio (%)                  82.50     93.26
Time to Recognition (sec)           46        189
Recognition Rate (features/sec)     0.891     0.476
Number of Reversals                 4         10
Reversal Rate (reversals/feature)   0.098     0.111

SUBJECT #3

Haptic

Comb 

91 

44 

57 

117 

92.86 

78.45 

105 

130 

0.543 

0.900 

11 

48 

0.193 

0.410 

52 


MEASURE OF RECOGNITION

Letter: X

SUBJECT #1

SUBJECT #2

SUBJECT #3


REFERENCES


Acosta, J.C., Modeling of Exploratory Procedures for Remote Object Identification,
Master's Thesis, Naval Postgraduate School, Monterey, CA, September 1991.

Beierl, P.G., Finite Memory Model for Haptic Recognition, Master's Thesis, Naval
Postgraduate School, Monterey, CA, December 1991.

Driels, M.R., and Spain, H., "Haptic Recognition Through Teleoperation", Ergonomics
of Hybrid Automated Systems II, Karwowski and Rahimi, eds., Elsevier Science
Publishers, 1990.

Hacisalihzade, S.S., Stark, L.W., and Allen, J.S., "Visual Perception and Sequences of
Eye Movement Fixations: A Stochastic Modeling Approach", Submitted for IEEE
Transactions on Systems, Man, and Cybernetics, Swiss Federal Institute of Technology,
Zurich, Switzerland, 6 November 1990.

Gibson, J.J., The Senses Considered as Perceptual Systems, Houghton Mifflin, 1966.

Johnsen, E.G. and Corliss, W.R., Human Factors Applications in Teleoperation Design
and Operation, Wiley, 1971.

Noton, D. and Stark, L.W., "Eye Movements and Visual Perception", Scientific
American, Vol 224, No 6, pp. 34-43, June 1971.



INITIAL DISTRIBUTION LIST

                                                            No. Copies

1. Defense Technical Information Center                         2
   Cameron Station
   Alexandria, VA 22304-6145

2. Library, Code 52                                             2
   Naval Postgraduate School
   Monterey, CA 93943-5002

3. Department Chairman, Code ME                                 1
   Department of Mechanical Engineering
   Naval Postgraduate School
   Monterey, CA 93943-5000

4. Professor Morris R. Driels, Code ME/Dr                       2
   Department of Mechanical Engineering
   Naval Postgraduate School
   Monterey, CA 93943-5000

5. Naval Engineering Curricular Office, Code 34                 1
   Department of Mechanical Engineering
   Naval Postgraduate School
   Monterey, CA 93943-5000

6. Professor Lawrence Stark                                     1
   483 Minor Hall
   University of California
   Berkeley, CA 94270

7. David C. Smith                                               1
   NRAD, Code 531
   271 Catalina Blvd
   Bayside Bldg 54
   San Diego, CA 92152-5000

8. Lieutenant David L. Klein, USN                               1
   2858 Canyon Falls Drive
   Jacksonville, FL 32224




